I Tried a Bunch of Things: the Dangers of Unexpected Overfitting in Classification

Authors

  • MICHAEL SKOCIK
  • JOHN COLLINS
  • CHLOE CALLAHAN-FLINTOFT
  • HOWARD
  • BRAD WYBLE
Abstract

Machine learning is a powerful set of techniques that has enhanced the ability of neuroscientists to interpret information collected through EEG, fMRI, MEG, and PET. With these new techniques come new dangers of overfitting that are not well understood by the neuroscience community. In this article, we use Support Vector Machine (SVM) classifiers and genetic algorithms to demonstrate the ease with which overfitting can occur, despite the use of cross-validation. We demonstrate that comparable, non-generalizable results can be obtained on informative and non-informative (i.e., random) data by iteratively modifying hyperparameters in seemingly innocuous ways. We recommend a number of techniques for limiting overfitting, such as lock boxes, blind analyses, and pre-registrations. These techniques, although uncommon in neuroscience applications, are common in many other fields that use machine learning, including computer science and physics. Adopting similar safeguards is critical for ensuring the robustness of machine-learning techniques.
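The failure mode described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: to stay dependency-free it swaps the SVM for a nearest-centroid classifier and the genetic algorithm for a mutation-only hill climb over a feature mask. All function names, sample sizes, and the one-third lock box split are assumptions made for illustration. The data are pure noise, so any gain in cross-validated accuracy during tuning reflects overfitting to the folds, and the lock box, which is set aside before tuning and scored once, gives the honest estimate.

```python
import random

def make_random_data(n_samples, n_features, rng):
    """Pure-noise features with balanced labels: there is nothing real to learn."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]  # labels are unrelated to X
    return X, y

def centroid_predict(train_X, train_y, test_X, mask):
    """Nearest-centroid classifier (a stand-in for the SVM) on the masked features."""
    feats = [j for j, keep in enumerate(mask) if keep]
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(train_X, train_y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    preds = []
    for x in test_X:
        dists = {lab: sum((x[j] - c) ** 2 for j, c in zip(feats, cen))
                 for lab, cen in centroids.items()}
        preds.append(min(dists, key=dists.get))
    return preds

def cv_accuracy(X, y, mask, k=5):
    """k-fold cross-validated accuracy, with fold membership given by index mod k."""
    correct = 0
    for fold in range(k):
        tr = [i for i in range(len(X)) if i % k != fold]
        te = [i for i in range(len(X)) if i % k == fold]
        preds = centroid_predict([X[i] for i in tr], [y[i] for i in tr],
                                 [X[i] for i in te], mask)
        correct += sum(p == y[i] for p, i in zip(preds, te))
    return correct / len(X)

def tune_mask(X, y, n_iter, rng):
    """Flip one feature at a time and keep any change that does not lower the CV score."""
    n_features = len(X[0])
    mask = [True] * n_features
    baseline = best = cv_accuracy(X, y, mask)
    for _ in range(n_iter):
        cand = mask[:]
        j = rng.randrange(n_features)
        cand[j] = not cand[j]
        score = cv_accuracy(X, y, cand)
        if score >= best:
            mask, best = cand, score
    return mask, baseline, best

def run_experiment(seed=0, n_samples=60, n_features=30, n_iter=300):
    rng = random.Random(seed)
    X, y = make_random_data(n_samples, n_features, rng)
    # Set the lock box aside BEFORE any tuning; it is scored exactly once at the end.
    n_lock = n_samples // 3
    lock_X, lock_y = X[:n_lock], y[:n_lock]
    dev_X, dev_y = X[n_lock:], y[n_lock:]
    mask, baseline_cv, tuned_cv = tune_mask(dev_X, dev_y, n_iter, rng)
    lock_preds = centroid_predict(dev_X, dev_y, lock_X, mask)
    lockbox_acc = sum(p == t for p, t in zip(lock_preds, lock_y)) / n_lock
    return baseline_cv, tuned_cv, lockbox_acc

if __name__ == "__main__":
    baseline_cv, tuned_cv, lockbox_acc = run_experiment()
    print(f"CV accuracy before tuning: {baseline_cv:.2f}")
    print(f"CV accuracy after tuning:  {tuned_cv:.2f}")
    print(f"Lock box accuracy:         {lockbox_acc:.2f}")
```

Because the tuning loop only accepts changes that keep or raise the cross-validated score, that score can only drift upward, even though the labels carry no information. The lock box accuracy is not subject to that selection pressure, so on noise it should tend to stay near chance; a large gap between the two numbers is the warning sign the abstract describes.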





Publication date: 2016